

First Semester M.Tech. Degree Examination, Dec.09/Jan.10 Computer Architecture

Time: 3 hrs.

Max. Marks:100

05SCS11

## Note: Answer any FIVE full questions.

- 1 a. With neat block diagrams explain the classification of computer architecture based on the notions of instruction and data streams. (06 Marks)
  - b. With relevant block diagrams, explain any two shared-memory multiprocessor models. (08 Marks)
  - c. The following code segment, consisting of six instructions, needs to be executed 64 times for the evaluation of the vector arithmetic expression $D(I) = A(I) + B(I) \times C(I)$ for $0 \le I \le 63$.

R1, R2 and R3 are CPU registers; (R1) is the content of R1; $\alpha$, $\beta$, $\gamma$ and $\theta$ are the starting memory addresses of arrays B(I), C(I), A(I) and D(I) respectively. Assume four clock cycles for each load or store, two cycles for the add and eight cycles for the multiply on either a uniprocessor or a single PE in an SIMD machine.

- i) Calculate the total number of CPU cycles needed to execute the above code segment repeatedly 64 times on an SISD uniprocessor computer sequentially, ignoring all other time delays.
- ii) Consider the use of an SIMD computer with 64 PEs to execute the above vector operations in six synchronized vector instructions over 64-component vector data, with both machines driven by the same-speed clock. Calculate the total execution time on the SIMD machine, ignoring instruction broadcast and other delays.
- iii) What is the speedup gain of the SIMD computer over the SISD computer? (06 Marks)
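The cycle counting asked for in question 1(c) can be sketched as below. The six-instruction code segment itself is not reproduced in this paper, so the instruction mix used here (three loads, one multiply, one add, one store) is an assumption inferred from the expression $D(I) = A(I) + B(I) \times C(I)$; the cycle costs are the ones stated in the question.

```python
# Cycle costs as stated in the question.
CYCLES = {"load": 4, "store": 4, "add": 2, "multiply": 8}

# ASSUMPTION: the code segment is not shown in the paper. This mix is inferred
# from D(I) = A(I) + B(I) * C(I): load B, load C, multiply, load A, add, store D.
INSTRUCTION_MIX = ["load", "load", "multiply", "load", "add", "store"]


def cycles_per_pass(mix, costs):
    """Cycles for one pass through the code segment."""
    return sum(costs[op] for op in mix)


def sisd_cycles(mix, costs, n):
    """A uniprocessor repeats the whole segment once per vector element."""
    return n * cycles_per_pass(mix, costs)


def simd_cycles(mix, costs, n, num_pes):
    """Each vector instruction handles num_pes elements in lockstep."""
    passes = -(-n // num_pes)  # ceiling division
    return passes * cycles_per_pass(mix, costs)


sisd = sisd_cycles(INSTRUCTION_MIX, CYCLES, 64)
simd = simd_cycles(INSTRUCTION_MIX, CYCLES, 64, 64)
speedup = sisd / simd
print(f"SISD: {sisd} cycles, SIMD: {simd} cycles, speedup: {speedup:.0f}")
```

Under the assumed mix, one pass costs 26 cycles, so the SISD machine needs 64 passes and the SIMD machine needs one. A different instruction mix would change the per-pass count but not the speedup, which depends only on the number of PEs when all other delays are ignored.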
- 2 a. With the help of block diagram, explain the principle of data forwarding in define-use and load-use conflicts. (10 Marks)
  - b. The following instructions are executed using a 5-stage pipeline. List all the data dependencies present among these instructions. (05 Marks)

| Instruction | Opcode | Operands   |
|-------------|--------|------------|
| i1          | load   | r1, a      |
| i2          | add    | r3, r1, r2 |
| i3          | mul    | r4, r5, r6 |
| i4          | add    | r5, r4, r8 |
| i5          | mul    | r5, r7, r9 |
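One way to enumerate the dependencies in question 2(b) is a pairwise scan over the instruction list, sketched below. It assumes the first operand of each instruction is the destination register and the remaining operands are sources, which the paper does not state explicitly. The tuple layout and the function name `find_dependencies` are illustrative only.

```python
from itertools import combinations

# (name, destination, sources). ASSUMPTION: the first operand is the destination.
PROGRAM = [
    ("i1", "r1", ["a"]),         # load r1, a
    ("i2", "r3", ["r1", "r2"]),  # add  r3, r1, r2
    ("i3", "r4", ["r5", "r6"]),  # mul  r4, r5, r6
    ("i4", "r5", ["r4", "r8"]),  # add  r5, r4, r8
    ("i5", "r5", ["r7", "r9"]),  # mul  r5, r7, r9
]


def find_dependencies(program):
    """Return (kind, earlier, later, register) for every ordered pair.

    This is a plain pairwise check; it does not filter out a dependency
    that is masked by an intervening write to the same register.
    """
    deps = []
    for (n1, d1, s1), (n2, d2, s2) in combinations(program, 2):
        if d1 in s2:   # later instruction reads what the earlier one wrote
            deps.append(("RAW", n1, n2, d1))
        if d2 in s1:   # later instruction overwrites what the earlier one read
            deps.append(("WAR", n1, n2, d2))
        if d1 == d2:   # both instructions write the same register
            deps.append(("WAW", n1, n2, d1))
    return deps


for kind, earlier, later, reg in find_dependencies(PROGRAM):
    print(f"{kind}: {earlier} -> {later} on {reg}")
```

Under that operand convention the scan reports two RAW dependencies (i1 to i2 on r1, i3 to i4 on r4), two WAR dependencies on r5 (i3 to i4, i3 to i5) and one WAW dependency on r5 (i4 to i5).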

- c. Consider the execution of a program of 15,000 instructions by a linear pipeline processor with a clock rate of 25 MHz. Assume that the instruction pipeline has 5 stages and that one instruction is issued per clock cycle. The penalties due to branch instructions and out-of-sequence execution are ignored.
  - i) Calculate the speedup factor in using this pipeline to execute the program as compared with the use of an equivalent nonpipelined processor with an equal amount of flow-through delay. (05 Marks)
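The speedup in question 2(c) follows from the usual linear-pipeline model, sketched below: a $k$-stage pipeline needs $k + (n - 1)$ cycles for $n$ instructions, while an equivalent nonpipelined processor with the same flow-through delay needs $n \cdot k$ cycles. The function names are illustrative; the numbers are the ones given in the question.

```python
def pipelined_cycles(n, k):
    """k cycles to fill the pipeline, then one instruction completes per cycle."""
    return k + (n - 1)


def nonpipelined_cycles(n, k):
    """Each instruction occupies the full flow-through delay of k cycles."""
    return n * k


def pipeline_speedup(n, k):
    return nonpipelined_cycles(n, k) / pipelined_cycles(n, k)


N_INSTRUCTIONS = 15_000
N_STAGES = 5
CLOCK_HZ = 25e6

speedup = pipeline_speedup(N_INSTRUCTIONS, N_STAGES)
exec_time = pipelined_cycles(N_INSTRUCTIONS, N_STAGES) / CLOCK_HZ

print(f"speedup = {speedup:.4f}")
print(f"pipelined execution time = {exec_time * 1e6:.2f} microseconds")
```

For a program this long the speedup is very close to the stage count of 5, because the fill time of the pipeline is negligible next to 15,000 instructions.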

- 3 a. Distinguish CSA from CPA. With a block diagram, explain the pipeline unit for fixed-point multiplication of 8-bit integers (using a 'Wallace tree' of CSAs and a CPA). (10 Marks)
  - b. For the pipeline reservation table shown in Fig.3(b):
    - List the forbidden latencies.
    - Draw the state transition diagram.
    - List all the simple cycles and greedy cycles.
    - Determine the minimal average latency (MAL).
    - Determine the upper bound on the MAL.

Fig.3(b): Pipeline reservation table (stages $S_1$ to $S_3$ versus time steps 1 to 5).
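The first steps of a reservation-table analysis can be sketched as follows. The table used here is hypothetical and is not the table of Fig.3(b), whose entries are not legible in this copy; it only illustrates how forbidden latencies, the initial collision vector and the standard bounds on the MAL are obtained.

```python
# HYPOTHETICAL reservation table (not Fig.3(b)):
# stage -> set of 1-based time steps in which the stage is busy.
TABLE = {
    "S1": {1, 5},
    "S2": {2, 4},
    "S3": {3},
}


def forbidden_latencies(table):
    """A latency is forbidden if any stage is busy at two steps that far apart."""
    forbidden = set()
    for times in table.values():
        for t1 in times:
            for t2 in times:
                if t2 > t1:
                    forbidden.add(t2 - t1)
    return sorted(forbidden)


def collision_vector(forbidden):
    """Bit string C_m ... C_1, where C_i = 1 marks latency i as forbidden."""
    m = max(forbidden)
    return "".join("1" if i in forbidden else "0" for i in range(m, 0, -1))


def mal_bounds(table, cv):
    """Lower bound: most marks in any row. Upper bound: ones in the vector + 1."""
    lower = max(len(times) for times in table.values())
    upper = cv.count("1") + 1
    return lower, upper


forbidden = forbidden_latencies(TABLE)
cv = collision_vector(forbidden)
lower, upper = mal_bounds(TABLE, cv)
print("forbidden latencies:", forbidden)
print("initial collision vector:", cv)
print(f"{lower} <= MAL <= {upper}")
```

The state transition diagram is then built by shifting the collision vector right by each permissible latency and OR-ing the result with the initial vector; simple and greedy cycles are read off that diagram.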

- 4 a. With reference to superscalar processors, explain the sequential consistency models. (10 Marks)
  - b. With a neat block diagram, explain the BTAC scheme for accessing branch targets. (10 Marks)

- 5 a. Explain any three important characteristic features of a general fine-grained SIMD architecture. With a neat block diagram, explain the MPP architecture.
  - b. By considering the matrix multiplication application, explain the systolic architecture. Also, draw the block diagram of the PE and the system connection for matrix multiplication. (10 Marks)

- 6 a. Explain the cache coherence problem. Explain any three reasons that cause cache inconsistency. (10 Marks)
  - b. Explain the snoopy bus protocol for maintaining cache consistency in write-through and write-back caches. (10 Marks)

- 7 a. Explain the principle of data flow architecture and distinguish data flow computers from von Neumann computers. (10 Marks)
  - b. Explain the architecture of a VLIW processor and its pipeline operations. (10 Marks)

- 8 Write short notes on:

openning in superscalar processor.

Power PC 620.


second generation multicomputers.

dSC processor.

(20 Marks)

\* \* \* \* \*

(10 Marks)